Abstract. Many real networks have cliques as their constitutional units. Here we
present a family of scale-free network models consisting of cliques, which is established
by a simple recursive algorithm. We investigate the networks both analytically and
numerically. The obtained analytical solution shows that the networks follow a power-law
degree distribution, with degree exponent continuously tunable between 2 and 3,
coinciding with empirically found results. The exact expression of the clustering
coefficient is also provided for the networks. Furthermore, the investigation of the
average path length reveals that the networks possess the small-world feature.



PACS numbers: 02.50.Cw, 05.45.Pq, 89.75.Da, 05.10.-a
1. Introduction

Over the last few years, it has been suggested that many social, technological,
biological, and information networks share the following three striking statistical
characteristics [1, 2, 3, 4, 5, 6, 7]: power-law degree distribution [8], high clustering
coefficient [9], and small average path length (APL). A power-law degree distribution
indicates that the majority of nodes in such networks have only a few connections to
other nodes, whereas some nodes are connected to many other nodes in the network.
A large clustering coefficient implies that nodes having a common neighbor are far more
likely to be linked to each other than are two nodes selected randomly. A short APL
shows that the expected number of links needed to pass from one arbitrarily selected
node to another is low, that is, the APL grows logarithmically with the number of nodes
or slower.

Mimicking such complex real-life systems is an important issue. A wide variety
of models have been proposed [1, 2, 3, 4, 5, 6, 7], among which the most well-known
successful attempts are Watts and Strogatz's (WS) small-world network model [9]
and Barabasi and Albert's (BA) scale-free network model [8], which have attracted
an exceptional amount of attention from a wide circle of researchers and started an
avalanche of research on models of such systems within the physics community. After
that, a considerable number of other models and mechanisms, which may represent
more realistically the processes taking place in real-life systems, have been developed.
These include nonlinear preferential attachment [10], initial attractiveness [11], edge
rewiring [12] and removal [13], aging and cost [14], competitive dynamics [15],
duplication [16], weight [17, 18], geographical constraint [19, 20, 21], Apollonian
packing [22, 23, 24, 25, 26, 27] and so forth.

The above-mentioned models and mechanisms may provide valuable insight into
some particular real-life networks. However, since different networks arise from different
generating mechanisms, it is almost impossible to mimic all real-life systems with a few
special models. Thus, it is necessary to model particular networks according to their
corresponding generating mechanisms.

In the real world, many networks consist of cliques. For example, in the movie actor
collaboration network [9] and the scientific collaboration graph [28], actors appearing in
the same film or authors signing the same paper form a clique, respectively. In the
corporate director network [29], directors who are members of the same board constitute
a clique. Analogously, in public transport networks [30], bus (tramway, or underground)
stops form a clique if they are consecutive stops on a route, and in the network of concepts
in written texts [31], the words in each sentence of the text are added to the network as
a clique. All these pose a very interesting and important question of how to build
evolution models based on this particular network component, the clique.

In this paper, we suggest a growing network model with cliques as its
basic constitutional units, giving high general versatility for growth mechanisms. The
model is governed by three tunable parameters $p$, $q$, and $m$, which control the relevant
network characteristics. Our networks have a power-law degree distribution with degree
exponent tunable between 2 and 3, a very large clustering coefficient, and a small-world
feature. The proposed model considers systematic reorganization of cliques as its
building block, which is helpful for understanding development processes and controls
in real-world networks.

Figure 1. Illustration of a deterministically growing network in the case of $p = 1$,
$q = 2$, and $m = 2$, showing the first three steps of the growing process.

2. Network construction 

We construct the networks in a recursive manner and denote the networks after $t$
generations by $Q(q,t)$, $q \ge 2$, $t \ge 0$. Figure 1 shows the network growing process
for the particular case of $p = 1$, $q = 2$, and $m = 2$. The networks are constructed as
follows: For $t = 0$, $Q(q,0)$ is a complete graph $K_{q+1}$ (or $(q+1)$-clique). For $t \ge 1$,
$Q(q,t)$ is obtained from $Q(q,t-1)$. For each of the existing subgraphs of $Q(q,t-1)$
that is a $q$-clique, with probability $p$ ($0 < p \le 1$), $m$ ($m$ is a positive integer) new
vertices are created, and each is connected to all the vertices of this subgraph. The
growing process is repeated until the network reaches a desired order.
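The construction rule above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used for the simulations in this paper: the function name `grow_clique_network`, the `seed` argument, and the choice to store every q-clique explicitly as a tuple of node labels are all assumptions made for the example.

```python
import random
from itertools import combinations


def grow_clique_network(q, p, m, steps, seed=None):
    """Grow the network for `steps` generations (hypothetical helper).

    Returns (number of nodes, set of edges, list of q-cliques).
    """
    rng = random.Random(seed)
    # Generation 0: the complete graph on q + 1 nodes.
    n_nodes = q + 1
    edges = set(combinations(range(n_nodes), 2))
    cliques = list(combinations(range(n_nodes), q))  # its q + 1 q-cliques
    for _ in range(steps):
        born = []  # q-cliques created during this generation
        for clique in cliques:  # only cliques that existed before this step
            if rng.random() >= p:
                continue  # this clique stays inactive in this generation
            for _copy in range(m):
                new = n_nodes
                n_nodes += 1
                edges.update((old, new) for old in clique)
                # The new node forms q new q-cliques with the parent clique.
                born.extend(sub + (new,) for sub in combinations(clique, q - 1))
        cliques.extend(born)
    return n_nodes, edges, cliques
```

For the deterministic case of Figure 1 ($p = 1$, $q = 2$, $m = 2$), one generation turns the initial triangle into a graph with 9 nodes and 15 edges, since each of the three edges receives two new nodes.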

There are at least four limiting cases of our model, listed below. (i) When $q = 2$,
$p = 1$, and $m = 1$, the networks are exactly the same as the pseudofractal scale-free
web [32]. (ii) When $q = 2$, $p \to 0$ (but $p \neq 0$), and $m = 1$, our model is reduced to
the scale-free network with size-dependent degree distribution [33]. (iii) When $q = 2$,
$0 < p < 1$, and $m = 1$, our networks coincide with the stochastically growing scale-free
network described in Ref. [34]. (iv) When $q > 2$, $p = 1$, and $m = 1$, our networks
reduce to the recursive graphs discussed in Ref. [35].

Next we compute the numbers of nodes (vertices) and links (edges) in $Q(q,t)$. Let
$L_v(t)$, $L_e(t)$ and $K_{q,t}$ be the numbers of vertices, edges and $q$-cliques created at step
$t$, respectively. Note that the addition of each new node leads to $q$ new $q$-cliques and
$q$ new edges. So, we have $L_e(t) = K_{q,t} = q L_v(t)$. Then, at step 1, we add an expected
$L_v(1) = mp(q+1)$ new nodes and $L_e(1) = mpq(q+1)$ new edges to $Q(q,0)$. After
simple calculations, one can obtain that at step $t_i$ ($t_i \ge 1$) the numbers of newly born
nodes and edges are $L_v(t_i) = mp(q+1)(1+mpq)^{t_i-1}$ and $L_e(t_i) = mpq(q+1)(1+mpq)^{t_i-1}$,
respectively. Thus the average numbers of total nodes $N_t$ and edges $E_t$ present at step $t$
are

$$N_t = \sum_{t_i=0}^{t} L_v(t_i) = \frac{(q+1)\left[(mpq+1)^t + q - 1\right]}{q} \tag{1}$$

and

$$E_t = \sum_{t_i=0}^{t} L_e(t_i) = \frac{(q+1)\left[2(mpq+1)^t + q - 2\right]}{2}, \tag{2}$$

respectively, where $L_v(0) = q+1$ and $L_e(0) = q(q+1)/2$ count the nodes and edges of the
initial $(q+1)$-clique. So for large $t$, the average degree $\bar{k}_t = 2E_t/N_t$ is approximately $2q$.
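As a quick consistency check, the expected node and edge counts can be evaluated both from the closed forms, $N_t = (q+1)[(mpq+1)^t + q - 1]/q$ and $E_t = (q+1)[2(mpq+1)^t + q - 2]/2$, and by summing the per-step increments directly. The sketch below assumes the initial $(q+1)$-clique contributes $q+1$ nodes and $q(q+1)/2$ edges; the function names are hypothetical.

```python
def expected_counts(q, p, m, t):
    """Closed-form expected numbers of nodes and edges after t steps."""
    growth = 1 + m * p * q
    n_t = (q + 1) * (growth ** t + q - 1) / q
    e_t = (q + 1) * (2 * growth ** t + q - 2) / 2
    return n_t, e_t


def summed_counts(q, p, m, t):
    """Same quantities obtained by adding up the per-step increments."""
    growth = 1 + m * p * q
    n_t = q + 1                # nodes of the initial (q+1)-clique
    e_t = q * (q + 1) / 2      # edges of the initial (q+1)-clique
    for step in range(1, t + 1):
        born = m * p * (q + 1) * growth ** (step - 1)  # L_v(step)
        n_t += born
        e_t += q * born        # each new node brings q new edges
    return n_t, e_t
```

For the pseudofractal case ($q = 2$, $p = 1$, $m = 1$) both routes give 15 nodes and 27 edges at $t = 2$, and the ratio $2E_t/N_t$ approaches $2q = 4$ as $t$ grows.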



3. Topological properties 

Topological properties are of fundamental significance for understanding the complex
dynamics of real-life systems. Here we focus on three important characteristics: degree
distribution, clustering coefficient, and average path length, which are determined by the
tunable model parameters $p$, $q$, and $m$.



3.1. Degree distribution 

When a new node $i$ is added to the networks at step $t_i$, it has degree $q$ and forms $q$
$q$-cliques. Let $L_q(i,t)$ be the number of $q$-cliques at step $t$ that may create new
nodes connected to node $i$ at step $t+1$. At step $t_i$, $L_q(i,t_i) = q$. By construction,
in the subsequent steps each new neighbor of $i$ generates $q-1$ new
$q$-cliques with $i$ as one of their vertices. Then at step $t_i+1$, there are $mpq$ new nodes,
which form $mpq(q-1)$ new $q$-cliques containing $i$. Let $k_i(t)$ be the degree of $i$ at step
$t$. We can easily find the following relations for $t \ge t_i + 1$:

$$\Delta k_i(t) = k_i(t) - k_i(t-1) = mp\,L_q(i,t-1) \tag{3}$$

and

$$L_q(i,t) = L_q(i,t-1) + (q-1)\,\Delta k_i(t). \tag{4}$$

From the above two equations, we can derive $L_q(i,t+1) = L_q(i,t)[1+mp(q-1)]$. Since
$L_q(i,t_i) = q$, we have $L_q(i,t) = q[1+mp(q-1)]^{t-t_i}$ and $\Delta k_i(t) = mpq[1+mp(q-1)]^{t-t_i-1}$.
Then the degree $k_i(t)$ of node $i$ at time $t$ is

$$k_i(t) = k_i(t_i) + \sum_{\tau=t_i+1}^{t} \Delta k_i(\tau) = q\,\frac{[1+mp(q-1)]^{t-t_i} + q - 2}{q-1}. \tag{5}$$
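The recursion in Eqs. (3) and (4) can be iterated numerically and compared with the closed form it implies, $k_i(t) = q\{[1+mp(q-1)]^{t-t_i} + q - 2\}/(q-1)$. The following is a minimal sketch under the mean-field reading of those equations (expected values, no randomness); the helper names and the `age` argument, standing for $t - t_i$, are assumptions of the example.

```python
def degree_by_recursion(q, p, m, age):
    """Iterate Eqs. (3) and (4) for a node that entered `age` steps ago."""
    degree = q   # k_i(t_i) = q
    active = q   # L_q(i, t_i) = q cliques containing the node
    for _ in range(age):
        gained = m * p * active      # Eq. (3): expected new neighbors
        degree += gained
        active += (q - 1) * gained   # Eq. (4): new cliques through the node
    return degree


def degree_closed_form(q, p, m, age):
    """Closed-form expected degree of a node of the given age."""
    base = 1 + m * p * (q - 1)
    return q * (base ** age + q - 2) / (q - 1)
```

For $q = 2$, $p = 1$, $m = 1$ both functions return 16 at age 3, which is the familiar doubling of degrees in the pseudofractal web.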

Since the degree of each node has been obtained explicitly as in Eq. (5), we
can get the degree distribution via its cumulative distribution [3], i.e.,
$P_{\rm cum}(k) \equiv \sum_{k' \ge k} N(k',t)/N_t \sim k^{1-\gamma}$, where $N(k',t)$ denotes the number of
nodes with degree $k'$. The detailed analysis is given as follows. For a degree

$$k = q\,\frac{[1+mp(q-1)]^{t-s} + q - 2}{q-1}$$

there are $L_v(s) = mp(q+1)(1+mpq)^{s-1}$ nodes with this exact degree, all of which were
born at step $s$. All nodes born at time $s$ or earlier have this or a higher degree. So we
have

$$\sum_{k' \ge k} N(k',t) = \sum_{a=0}^{s} L_v(a) = \frac{(q+1)\left[(mpq+1)^s + q - 1\right]}{q}.$$

As the total number of nodes at step $t$ is given in Eq. (1), we have

$$\left[q\,\frac{[1+mp(q-1)]^{t-s} + q - 2}{q-1}\right]^{1-\gamma} = \frac{(q+1)\left[(mpq+1)^s + q - 1\right]/q}{(q+1)\left[(mpq+1)^t + q - 1\right]/q} = \frac{(mpq+1)^s + q - 1}{(mpq+1)^t + q - 1}.$$

Therefore, for large $t$ we obtain

$$\left\{[1+mp(q-1)]^{t-s}\right\}^{1-\gamma} = (1+mpq)^{s-t}$$

and

$$\gamma \approx 1 + \frac{\ln(1+mpq)}{\ln[1+mp(q-1)]}. \tag{6}$$

Thus, the degree exponent $\gamma$ is a continuous function of $p$, $q$, and $m$, and belongs to the
interval [2,3]. For any fixed $q$, as $p$ decreases from 1 to 0, $\gamma$ increases from
$1 + \frac{\ln(1+mq)}{\ln[1+m(q-1)]}$ to $2 + \frac{1}{q-1}$ (see Appendix A for the theoretical calculation of the
distribution for the particular case of $m = 1$). In the case $q = 2$, $\gamma$ can be tuned
between $1 + \frac{\ln(1+2m)}{\ln(1+m)}$ and 3. In some limiting cases, Eq. (6) recovers the results
previously obtained in Refs. [32, 33, 34, 35].
Figure 2 shows, on a logarithmic scale, the scaling behavior of the cumulative degree
distribution $P_{\rm cum}(k)$ for different values of $p$ in the case of $q = 2$ and $m = 1$. Simulation
results agree very well with the analytical ones.
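To get a feeling for how the exponent responds to the parameters, the large-$t$ expression $\gamma \approx 1 + \ln(1+mpq)/\ln[1+mp(q-1)]$ can be evaluated directly. The short sketch below is illustrative only; the function name is hypothetical, and `math.log1p` is used instead of `math.log` purely to keep the small-$p$ limit numerically accurate.

```python
import math


def degree_exponent(q, p, m):
    """Large-t degree exponent: 1 + ln(1 + mpq) / ln(1 + mp(q - 1))."""
    x = m * p
    return 1 + math.log1p(x * q) / math.log1p(x * (q - 1))
```

With $q = 2$, $p = 1$, $m = 1$ this gives $1 + \ln 3/\ln 2 \approx 2.585$, the value known for the pseudofractal scale-free web, while letting $p$ tend to zero pushes the result toward $2 + 1/(q-1)$.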



3.2. Clustering coefficient 

If a given node in the network is connected to $k$ nodes, defined as the neighbors of
the given node, then the ratio between the number of links among its neighbors and
the maximum possible number of such links, $k(k-1)/2$, is the clustering coefficient of the
given node [9]. The clustering coefficient of the whole network is the average of this
coefficient over all nodes in the network, and can take on values between 0 and 1, the
latter corresponding to a maximally clustered network where all neighbors of a node are
linked to one another.
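The definition above translates directly into code. The sketch below assumes the graph is stored as a dictionary mapping each node to the set of its neighbors, and it assigns a clustering coefficient of 0 to nodes with fewer than two neighbors; both choices are conventions of this example rather than part of the model.

```python
from itertools import combinations


def local_clustering(adjacency, node):
    """Fraction of neighbor pairs of `node` that are themselves linked."""
    neighbours = adjacency[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # assumed convention for nodes with fewer than 2 neighbors
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adjacency[u])
    return 2 * links / (k * (k - 1))


def average_clustering(adjacency):
    """Network clustering coefficient: mean of the local values."""
    return sum(local_clustering(adjacency, n) for n in adjacency) / len(adjacency)
```

A triangle has average clustering 1, whereas attaching one pendant node to a triangle lowers the value of the attachment node to 1/3.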

Figure 2. The cumulative degree distribution $P_{\rm cum}(k)$ at various $p$ values for the case
of $q = 2$ and $m = 1$. The circles (a), squares (b), stars (c), and triangles (d) denote
the simulation results for networks with different evolutionary steps $t = 1350$, $t = 25$,
$t = 16$, and $t = 13$, respectively. The four straight lines are the theoretical results of
$\gamma(p,q)$ as provided by equation (6). All data are from the average of 50 independent
runs.

For our networks, the analytical expression of the clustering coefficient $C(k)$ for a single
node with degree $k$ can be derived exactly. When a node is created it is connected to
all the nodes of a $q$-clique, in which nodes are completely interconnected. So its degree
and clustering coefficient are $q$ and 1, respectively. In the following steps, if its degree
increases by one through a newly created node connecting to it, then there must be $q-1$
existing neighbors of it attaching to the new node at the same time. Thus for a node of
degree $k$, we have

$$C(k) = \frac{\frac{q(q-1)}{2} + (q-1)(k-q)}{\frac{k(k-1)}{2}} = \frac{2(q-1)\left(k - \frac{q}{2}\right)}{k(k-1)}, \tag{7}$$

which depends on both $k$ and $q$. For $k \gg q$, $C(k)$ is inversely proportional to the degree
$k$. The scaling $C(k) \sim k^{-1}$ has been found for some network models [22l [23l [2^
[271 [321 [SSI [SH [SSI [SS] , and has also been observed in several real-life networks.
Using Eq. (7), we can obtain the clustering $C_t$ of the networks at step $t$:

$$C_t = \frac{1}{N_t} \sum_{r=0}^{t} \frac{2(q-1)\left(D_r - \frac{q}{2}\right) L_v(r)}{D_r (D_r - 1)}, \tag{8}$$

where the sum runs over all the nodes and $D_r$ is the degree of the nodes created at step
$r$, which is given by Eq. (5).
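The sum over generations can be evaluated numerically. The sketch below assumes the single-node formula $C(k) = 2(q-1)(k - q/2)/[k(k-1)]$, the degree $D_r = q\{[1+mp(q-1)]^{t-r} + q - 2\}/(q-1)$ of nodes born at step $r$, and the generation sizes $L_v(0) = q+1$ and $L_v(r) = mp(q+1)(1+mpq)^{r-1}$ for $r \ge 1$, all read as expected values. The function names are hypothetical.

```python
def node_clustering(k, q):
    """Clustering coefficient of a node with degree k."""
    return 2 * (q - 1) * (k - q / 2) / (k * (k - 1))


def network_clustering(q, p, m, t):
    """Average clustering at step t, summed generation by generation."""
    growth = 1 + m * p * q
    base = 1 + m * p * (q - 1)
    n_t = (q + 1) * (growth ** t + q - 1) / q      # expected network size
    total = 0.0
    for r in range(t + 1):
        # Number of nodes born at step r (the initial clique for r = 0).
        born = q + 1 if r == 0 else m * p * (q + 1) * growth ** (r - 1)
        d_r = q * (base ** (t - r) + q - 2) / (q - 1)  # their degree at step t
        total += born * node_clustering(d_r, q)
    return total / n_t
```

For $q = 2$, $p = 1$, $m = 1$ the value is 0.75 at $t = 1$ (three nodes of degree 4 and three of degree 2) and it settles at 0.8 for large $t$.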

In the infinite network order limit ($N_t \to \infty$), Eq. (8) converges to a nonzero value
$C$. Obviously, the network clustering coefficient $C_t$ is a function of the parameters $p$, $q$,
and $m$. If we fix any two of them, $C_t$ increases with the remaining one. Exact analytical
computation shows: in the case $q = 2$ and $m = 1$, when $p$ increases from 0 to 1, $C$ grows
from 0.739 [37] to 0.8 [32]; in the case $p = 1$ and $q = 2$, when $m$ increases from 1 to
infinity, $C$ grows from 0.8 [32] to 1; likewise, in the case $p = 1$ and $m = 1$, $C$ increases
from 0.8 to 1 when $q$ increases from 2 to infinity, with special values $C_t = 0.8571$ and
$C_t = 0.8889$ for $q = 3$ and $q = 4$, respectively. Therefore, the average clustering coefficient
is very large, which shows the evolving networks are highly clustered. Figure 3 exhibits the
dependence of the clustering coefficient $C$ on $p$, $q$ and $m$, which agrees well with our
above conclusions.

From Figs. 2 and 3 and Eqs. (6) and (8), one can see that both the degree exponent
$\gamma$ and the clustering coefficient $C_t$ depend on the parameters $p$, $q$, and $m$. The mechanism
behind this relation deserves further study. A biased choice of the cliques at each
evolving step may be a possible explanation; see Ref. [38].



3.3. Average path length 



Denote the network nodes by the time step of their generation, $v = 1, 2, 3, \ldots, N-1, N$.
Using $L(N)$ to represent the APL of our model with system size $N$, we have the
following relation: $L(N) = \frac{2\sigma(N)}{N(N-1)}$, where $\sigma(N) = \sum_{1 \le i < j \le N} d_{i,j}$ is the total distance, in
which $d_{i,j}$ is the shortest distance between node $i$ and node $j$. By using an approach
similar to that in Refs. [2TI EH we can evaluate the APL of the present model.
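For a concrete graph, the total distance $\sigma(N)$ and the APL $L(N) = 2\sigma(N)/[N(N-1)]$ can be measured with a breadth-first search from every node. The sketch below assumes a connected, undirected graph stored as a dictionary mapping each node to the set of its neighbors; the function names are hypothetical.

```python
from collections import deque


def total_distance(adjacency):
    """Sum of shortest-path lengths over all unordered node pairs."""
    total = 0
    for source in adjacency:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from `source`
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    return total // 2  # every unordered pair was counted twice


def average_path_length(adjacency):
    """APL of a connected graph with at least two nodes."""
    n = len(adjacency)
    return 2 * total_distance(adjacency) / (n * (n - 1))
```

On a three-node path the pairwise distances are 1, 1 and 2, so the total distance is 4 and the APL is 4/3.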

Figure 4. Semilogarithmic graph of the APL vs the network size $N$ in the special
case of $m = 1$. Each data point is obtained as an average of 50 independent network
realizations. The lines are linear functions of $\ln N$.

Obviously, when a new node enters the networks, the smallest distances between
existing node pairs will not change. Hence we have

$$\sigma(N+1) = \sigma(N) + \sum_{i=1}^{N} d_{i,N+1}. \tag{9}$$

Equation (9) can be approximately represented as:

$$\sigma(N+1) = \sigma(N) + N + (N-q)L(N-q+1), \tag{10}$$

where

$$(N-q)L(N-q+1) = \frac{2\sigma(N-q+1)}{N-q+1} < \frac{2\sigma(N)}{N}. \tag{11}$$

Equations (10) and (11) provide an upper bound for the variation of $\sigma(N)$ as

$$\frac{d\sigma(N)}{dN} = N + \frac{2\sigma(N)}{N}, \tag{12}$$

which yields

$$\sigma(N) = N^2(\ln N + \omega), \tag{13}$$

where $\omega$ is a constant. As $\sigma(N) \sim N^2 \ln N$, we have $L(N) \sim \ln N$.
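That $\sigma(N) = N^2(\ln N + \omega)$ solves the bounding equation $d\sigma/dN = N + 2\sigma/N$ can be checked numerically with a finite-difference derivative. This is only a sanity check of the algebra; the function names, the step size `h`, and the sample values of $N$ and $\omega$ are arbitrary choices for the example.

```python
import math


def sigma(n, omega):
    """Candidate solution sigma(N) = N^2 (ln N + omega)."""
    return n * n * (math.log(n) + omega)


def ode_residual(n, omega, h=1e-3):
    """Central-difference d(sigma)/dN minus the right-hand side N + 2 sigma / N."""
    derivative = (sigma(n + h, omega) - sigma(n - h, omega)) / (2 * h)
    return derivative - (n + 2 * sigma(n, omega) / n)
```

The residual stays at the level of floating-point noise for any choice of $\omega$, which reflects that $\omega$ is an integration constant not fixed by the differential equation itself.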

Note that Eq. (12) was deduced from an inequality, which implies that $L(N)$ increases
at most as $\ln N$ with $N$. Thus, our model exhibits the small-world property. In Fig. 4
we show the dependence of the APL on system size $N$ for different $p$ and $q$ in the case
of $m = 1$. From Fig. 4 one can see that for fixed $q$, APL decreases with increasing $p$;
and for fixed $p$, APL is a decreasing function of $q$. When the network size $N$ is small,
APL is a linear function of $\ln N$; when $N$ becomes large, APL increases slightly more
slowly than $\ln N$. So the simulation results are in agreement with the analytical
prediction. It should be noted that in our model, if we fix $p$ and $q$ and consider
values of $m$ greater than 1, then the APL will increase more slowly than in the
case $m = 1$, since the larger $m$ is, the denser the network becomes.

Here we only give an upper bound for the APL, which increases slightly more slowly than
$\ln N$. In particular, in the case of $p = 1$, the networks grow deterministically, and we can
compute exactly the diameter, which is the maximum distance between all node pairs
of a graph. In this particular case, the diameter grows logarithmically with the network
size [21, 27].

4. Conclusion 

In summary, we have proposed and studied a class of evolving networks consisting of cliques,
reminiscent of modules in biological networks or communities in social systems. We have
obtained analytical and numerical results for the degree distribution and clustering
coefficient, as well as the average path length, which are determined by the model
parameters and are in accordance with a large number of real observations. The networks
have power-law degree distributions, with degree exponent adjustable continuously between
2 and 3. The clustering coefficient of single nodes has a power-law spectrum, and the
network clustering coefficient is very large and independent of network size. The
intervertex separation is small, increasing at most logarithmically with the network size.
Interestingly, our networks are formed by cliques; this particularity of the composing
units may provide a comprehensive perspective for understanding some real-life systems.
Appendix A. Exact degree distribution for some limiting cases

When $p \to 0$ (but $p \neq 0$) and $m = 1$, our model turns out to be the graph which
evolves as follows (see [34] for interpretation): starting with a $(q+1)$-clique ($t = 0$),
at each time step, we choose an existing $q$-clique, then we add a new node and join it
to all the nodes of the selected $q$-clique. Note that when $q = 2$, the particular model
gives the network studied in detail in Ref. [33]. Since the network size is incremented
by one with each step, here we use the step value $t$ to represent a node created at this
step. Furthermore, after a new node is added to the network, the number of $q$-cliques
increases by $q$. We can see easily that at step $t$, the network consists of $N = t + q + 1$
nodes and $N_q = qN - q^2 + 1$ $q$-cliques.
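The one-node-per-step process described above is easy to simulate. The sketch below assumes the $q$-clique is chosen uniformly at random among all existing $q$-cliques and that a chosen clique remains available for later steps; the function name, the `seed` argument and the explicit clique list are conventions of this example.

```python
import random
from itertools import combinations


def grow_one_node_per_step(q, steps, seed=None):
    """Add one node per step to a uniformly chosen q-clique (hypothetical helper).

    Returns (list of node degrees, list of q-cliques).
    """
    rng = random.Random(seed)
    n_nodes = q + 1
    degree = [q] * n_nodes                       # initial (q+1)-clique
    cliques = list(combinations(range(n_nodes), q))
    for _ in range(steps):
        parent = rng.choice(cliques)             # uniform choice of a q-clique
        new = n_nodes
        n_nodes += 1
        degree.append(q)                         # the new node has degree q
        for old in parent:
            degree[old] += 1
        # q new q-cliques appear, each containing the new node.
        cliques.extend(sub + (new,) for sub in combinations(parent, q - 1))
    return degree, cliques
```

After $t$ steps the simulation holds $N = t + q + 1$ nodes and $qN - q^2 + 1$ cliques regardless of the random choices, and a node of degree $k$ sits in $(q-1)k - q^2 + 2q$ of them.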

One can analyze the degree distribution mathematically as follows. Given a node,
when it is born, it has degree $q$, and the number of $q$-cliques containing this node is
also $q$. After that, each time its degree increases by one, the number of $q$-cliques with
this node as one of their components increases by $q-1$, so the number of $q$-cliques
available for selection that contain a node with degree $k$ is $(q-1)k - q^2 + 2q$. We denote
by $P_{k,N}$ the fraction of nodes with degree $k$ when the network size is $N$. Thus the number
of such nodes is $N P_{k,N}$. Then the probability that the new node happens to be connected
to a particular node $i$ having degree $k_i$ is proportional to $(q-1)k_i - q^2 + 2q$, and so
when properly normalized is just $[(q-1)k_i - q^2 + 2q]/(qN - q^2 + 1)$. So, between the
appearance of the $N$th and the $(N+1)$th node, the total expected number of nodes
with degree $k$ that gain a new link during this interval is

$$\frac{(q-1)k - q^2 + 2q}{qN - q^2 + 1} \times N P_{k,N} \simeq \frac{q-1}{q}\, k P_{k,N}, \tag{A.1}$$

which holds for large $N$. Observe that the number of nodes with degree $k$ will decrease
on each time step by exactly this number. At the same time the number increases
because of nodes that previously had degree $k-1$ and now have an extra link. Thus
we can write a master equation for the new number $(N+1)P_{k,N+1}$ of nodes with degree
$k$:

$$(N+1)P_{k,N+1} = N P_{k,N} + \frac{q-1}{q}\left[(k-1)P_{k-1,N} - k P_{k,N}\right]. \tag{A.2}$$

The only exception to Eq. (A.2) is for nodes having degree $q$, which instead obey the
equation

$$(N+1)P_{q,N+1} = N P_{q,N} + 1 - \frac{q-1}{q}\, q P_{q,N}, \tag{A.3}$$

since by construction exactly one new such node appears on each time step. When $N$
approaches $\infty$, we assume that the degree distribution tends to some fixed value
$P_k = \lim_{N \to \infty} P_{k,N}$. Then from Eq. (A.3), we have

$$P_q = \frac{1}{q}. \tag{A.4}$$

And Eq. (A.2) becomes

$$P_k = \frac{q-1}{q}\left[(k-1)P_{k-1} - k P_k\right], \tag{A.5}$$

from which we can easily obtain the recursive equation

$$P_k = \frac{k-1}{k + 1 + \frac{1}{q-1}}\, P_{k-1}, \tag{A.6}$$

which can be iterated to get

$$P_k = \frac{(k-1)(k-2)\cdots q}{\left(k+1+\frac{1}{q-1}\right)\left(k+\frac{1}{q-1}\right)\cdots\left(q+2+\frac{1}{q-1}\right)}\, P_q
= \frac{(k-1)(k-2)\cdots(q+1)}{\left(k+1+\frac{1}{q-1}\right)\left(k+\frac{1}{q-1}\right)\cdots\left(q+2+\frac{1}{q-1}\right)}, \tag{A.7}$$

where Eq. (A.4) has been used. This can be simplified further by making use of a handy
property of the $\Gamma$-function, $\Gamma(a) = (a-1)\Gamma(a-1)$, with $\Gamma(a)$ defined by

$$\Gamma(a) = \int_0^{\infty} x^{a-1} e^{-x}\, dx. \tag{A.8}$$

By this property and $\Gamma(1) = 1$, we get

$$P_k = \frac{\left(q+1+\frac{1}{q-1}\right)\left(q+\frac{1}{q-1}\right)\cdots\left(2+\frac{1}{q-1}\right)}{q(q-1)\cdots 1}\, \frac{\Gamma(k)\,\Gamma\!\left(2+\frac{1}{q-1}\right)}{\Gamma\!\left(k+2+\frac{1}{q-1}\right)}
= \frac{\left(q+1+\frac{1}{q-1}\right)\left(q+\frac{1}{q-1}\right)\cdots\left(2+\frac{1}{q-1}\right)}{q(q-1)\cdots 1}\, B\!\left(k,\, 2+\frac{1}{q-1}\right), \tag{A.9}$$

where $B(a,b)$ is the Legendre beta-function, which is defined as

$$B(a,b) = \frac{\Gamma(a)\Gamma(b)}{\Gamma(a+b)}. \tag{A.10}$$

Note that the beta-function has the interesting property that for large values of either
of its arguments it itself follows a power law. For instance, for large $a$ and fixed $b$,
$B(a,b) \sim a^{-b}$. Then we can immediately see that for large $k$, $P_k$ also has a power-law
tail with a degree exponent

$$\gamma = 2 + \frac{1}{q-1}. \tag{A.11}$$

For $q = 2$, $\gamma = 3$, which has been obtained previously in Ref. [34].
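The stationary distribution can also be generated numerically, which gives an independent check on the power-law tail. The sketch below assumes the starting value $P_q = 1/q$ and the recursion $P_k = (k-1)P_{k-1}/[k+1+1/(q-1)]$, and compares the result with the equivalent Gamma-function form $P_k = \Gamma(k)\,\Gamma(q+2+\frac{1}{q-1})/[q!\,\Gamma(k+2+\frac{1}{q-1})]$; the function names are hypothetical.

```python
import math


def stationary_distribution(q, k_max):
    """P_k for k = q..k_max obtained by iterating the recursion."""
    a = 1 / (q - 1)
    prob = {q: 1 / q}  # starting value P_q = 1/q
    for k in range(q + 1, k_max + 1):
        prob[k] = prob[k - 1] * (k - 1) / (k + 1 + a)
    return prob


def beta_form(q, k):
    """Same P_k written with Gamma functions (evaluated via log-gamma)."""
    a = 1 / (q - 1)
    log_p = (math.lgamma(k) + math.lgamma(q + 2 + a)
             - math.lgamma(q + 1) - math.lgamma(k + 2 + a))
    return math.exp(log_p)
```

For $q = 2$ the recursion reproduces $P_k = 12/[k(k+1)(k+2)]$, and the local slope $\log_2[P_k/P_{2k}]$ at large $k$ approaches $2 + 1/(q-1) = 3$.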

Equation (A.9) is similar to the Yule distribution [39], so called by Simon [40]. In fact,
this particular case of our model can be easily mapped onto the Yule process, which was
inspired by observations of the statistics of biological taxa. From this perspective, our
model may find applications in biological systems.